Hands-on Exercise 2.2: 2nd Order Spatial Point Patterns Analysis Methods

Author

Nguyen Bao Thu Phuong

Published

August 30, 2024

Modified

September 3, 2024

1 Overview

Spatial Point Pattern Analysis examines the pattern or distribution of points on a surface. These points can represent locations of events such as crimes, traffic accidents, or disease outbreaks, as well as business services (like coffee shops and fast food outlets) or facilities like childcare and eldercare centers.

In this hands-on exercise, we will use functions from the spatstat package to explore the spatial distribution of childcare centers in Singapore.

The key questions we aim to answer are:

  1. Are the childcare centers in Singapore randomly distributed across the country?
  2. If not, where are the areas with a higher concentration of childcare centers?

2 Data Acquisition

The datasets required for this exercise are extracted from the below public data sources:

  • CHILDCARE: A point feature dataset that provides both location and attribute information of childcare centers. This dataset was downloaded from data.gov.sg in GeoJSON format.

  • MP14_SUBZONE_WEB_PL: A polygon feature dataset that contains information on the URA 2014 Master Plan Planning Subzone boundaries. This dataset is in ESRI Shapefile format and was also downloaded from data.gov.sg.

  • CostalOutline: A polygon feature dataset that shows the national boundary of Singapore. This dataset is provided by the Singapore Land Authority (SLA) and is in ESRI Shapefile format.

3 Import R Packages

p_load() of pacman package is used to install and load sf, tmap, tidyverse, spatstat and raster packages into R environment.

pacman::p_load(sf, raster, spatstat, tmap, tidyverse)

4 Spatial Data Wrangling

4.1 Import Spatial Data

First we use st_read() of sf package used to import these three geospatial data sets into R.

As the childcare_sf simple feature data frame is in wgs84 geodetic CRS, which is not suitable for geospatial analysis, st_transform() of sf package is used to reproject the data frame to svy21 at the same time of import using below code chunk.

childcare_sf <- st_read("data/child-care-services-geojson.geojson") %>%
  st_transform(crs = 3414)
Reading layer `child-care-services-geojson' from data source 
  `C:\thuphuong1110\ISSS626-GAA\Hands-on_Ex\Hands-on_Ex02\data\child-care-services-geojson.geojson' 
  using driver `GeoJSON'
Simple feature collection with 1545 features and 2 fields
Geometry type: POINT
Dimension:     XYZ
Bounding box:  xmin: 103.6824 ymin: 1.248403 xmax: 103.9897 ymax: 1.462134
z_range:       zmin: 0 zmax: 0
Geodetic CRS:  WGS 84

We re-check the crs using below code chunk. The EPSG already reflects 3414 as expected.

st_crs(childcare_sf)
Coordinate Reference System:
  User input: EPSG:3414 
  wkt:
PROJCRS["SVY21 / Singapore TM",
    BASEGEOGCRS["SVY21",
        DATUM["SVY21",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4757]],
    CONVERSION["Singapore Transverse Mercator",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (N)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (E)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Cadastre, engineering survey, topographic mapping."],
        AREA["Singapore - onshore and offshore."],
        BBOX[1.13,103.59,1.47,104.07]],
    ID["EPSG",3414]]

Geospatial data is imported using below code chunk.

sg_sf <- st_read(dsn = "data", layer = "CostalOutline")
Reading layer `CostalOutline' from data source 
  `C:\thuphuong1110\ISSS626-GAA\Hands-on_Ex\Hands-on_Ex02\data' 
  using driver `ESRI Shapefile'
Simple feature collection with 60 features and 4 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 2663.926 ymin: 16357.98 xmax: 56047.79 ymax: 50244.03
Projected CRS: SVY21

Check the predefined coordinate system of this simple feature data frame using st_crs() of sf package.

st_crs(sg_sf)
Coordinate Reference System:
  User input: SVY21 
  wkt:
PROJCRS["SVY21",
    BASEGEOGCRS["SVY21[WGS84]",
        DATUM["World Geodetic System 1984",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]],
            ID["EPSG",6326]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["Degree",0.0174532925199433]]],
    CONVERSION["unnamed",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["Degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["Degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]]]

The last lines of the print shows that EPSG code 9001 is used instead of the correct EPSG code 3414 for coordinate reference system svy21. The correct EPSG code is assigned using st_set_crs() as below.

sg_sf = st_set_crs(sg_sf, 3414)
Warning: st_crs<- : replacing crs does not reproject data; use st_transform for
that

We check the CRS again.

st_crs(sg_sf)
Coordinate Reference System:
  User input: EPSG:3414 
  wkt:
PROJCRS["SVY21 / Singapore TM",
    BASEGEOGCRS["SVY21",
        DATUM["SVY21",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4757]],
    CONVERSION["Singapore Transverse Mercator",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (N)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (E)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Cadastre, engineering survey, topographic mapping."],
        AREA["Singapore - onshore and offshore."],
        BBOX[1.13,103.59,1.47,104.07]],
    ID["EPSG",3414]]

The EPSG code is now 3414.

mpsz_sf <- st_read(dsn = "data", 
                layer = "MP14_SUBZONE_WEB_PL")
Reading layer `MP14_SUBZONE_WEB_PL' from data source 
  `C:\thuphuong1110\ISSS626-GAA\Hands-on_Ex\Hands-on_Ex02\data' 
  using driver `ESRI Shapefile'
Simple feature collection with 323 features and 15 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
Projected CRS: SVY21

First, we check the predefined coordinate system of mpsz_sf simple feature data frame using st_crs().

st_crs(mpsz_sf)
Coordinate Reference System:
  User input: SVY21 
  wkt:
PROJCRS["SVY21",
    BASEGEOGCRS["SVY21[WGS84]",
        DATUM["World Geodetic System 1984",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]],
            ID["EPSG",6326]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["Degree",0.0174532925199433]]],
    CONVERSION["unnamed",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["Degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["Degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["(E)",east,
            ORDER[1],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]],
        AXIS["(N)",north,
            ORDER[2],
            LENGTHUNIT["metre",1,
                ID["EPSG",9001]]]]

Output interpretation: The last lines of the print shows that EPSG code 9001 is used instead of the correct EPSG code 3414 for coordinate reference system svy21. The correct EPSG code is assigned to mpsz_sf data frame using st_set_crs() as below.

mpsz_sf <- st_set_crs(mpsz_sf,3414)

We check the CRS again.

st_crs(mpsz_sf)
Coordinate Reference System:
  User input: EPSG:3414 
  wkt:
PROJCRS["SVY21 / Singapore TM",
    BASEGEOGCRS["SVY21",
        DATUM["SVY21",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4757]],
    CONVERSION["Singapore Transverse Mercator",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (N)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (E)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Cadastre, engineering survey, topographic mapping."],
        AREA["Singapore - onshore and offshore."],
        BBOX[1.13,103.59,1.47,104.07]],
    ID["EPSG",3414]]

The EPSG code is now 3414.

4.2 Plot the Map from geospatial data sets

After verifying the coordinate reference system (CRS) of each geospatial dataset, it is helpful to plot a map to visualize their spatial patterns.

tm_shape(mpsz_sf) +
  tm_polygons() +
tm_shape(childcare_sf) +
  tm_dots()

Notice that all the geospatial layers share the same map extent, indicating that their coordinate reference systems and values are aligned to the same spatial context. This alignment is crucial for any geospatial analysis.

Alternatively, we can create a pin map using the code snippet below.

tmap_mode('view')
tmap mode set to interactive viewing
tm_shape(childcare_sf)+
  tm_dots()
tmap_mode('plot')
tmap mode set to plotting

In interactive mode, tmap uses the Leaflet for R API. The benefit of this interactive pin map is that it allows us to freely navigate and zoom in or out. Additionally, we can click on each point to query detailed information about that feature. Three background options of the online map layer are currently available: ESRI.WorldGrayCanvas, OpenStreetMap, and ESRI.WorldTopoMap, with ESRI.WorldGrayCanvas set as the default.

Note: Always switch back to plot mode after using the interactive map, as each interactive session consumes a connection. Additionally, to prevent issues when publishing on Netlify, keep to fewer than 10 interactive maps in a single RMarkdown document.

5 Geospatial Data wrangling

Although simple feature data frames are becoming increasingly popular compared to the sp package’s Spatial* classes, many geospatial analysis packages still require geospatial data in the sp package’s Spatial* format. In this section, you explore how to convert a simple feature data frame to an sp Spatial* class.

5.1 Convert sf data frames to sp’s Spatial* class

The code chunk below uses the as_Spatial() function from the sf package to convert the three geospatial data from simple feature data frames to sp Spatial* classes.

childcare <- as_Spatial(childcare_sf)
mpsz <- as_Spatial(mpsz_sf)
sg <- as_Spatial(sg_sf)
childcare
class       : SpatialPointsDataFrame 
features    : 1545 
extent      : 11203.01, 45404.24, 25667.6, 49300.88  (xmin, xmax, ymin, ymax)
crs         : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
variables   : 2
names       :    Name,                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           Description 
min values  :   kml_1, <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>ADDRESSBLOCKHOUSENUMBER</th> <td></td> </tr><tr bgcolor=""> <th>ADDRESSBUILDINGNAME</th> <td></td> </tr><tr bgcolor="#E3E3F3"> <th>ADDRESSPOSTALCODE</th> <td>018989</td> </tr><tr bgcolor=""> <th>ADDRESSSTREETNAME</th> <td>1, MARINA BOULEVARD, #B1 - 01, ONE MARINA BOULEVARD, SINGAPORE 018989</td> </tr><tr bgcolor="#E3E3F3"> <th>ADDRESSTYPE</th> <td></td> </tr><tr bgcolor=""> <th>DESCRIPTION</th> <td></td> </tr><tr bgcolor="#E3E3F3"> <th>HYPERLINK</th> <td></td> </tr><tr bgcolor=""> <th>LANDXADDRESSPOINT</th> <td>0</td> </tr><tr bgcolor="#E3E3F3"> <th>LANDYADDRESSPOINT</th> <td>0</td> </tr><tr bgcolor=""> <th>NAME</th> <td>THE LITTLE SKOOL-HOUSE INTERNATIONAL PTE. LTD.</td> </tr><tr bgcolor="#E3E3F3"> <th>PHOTOURL</th> <td></td> </tr><tr bgcolor=""> <th>ADDRESSFLOORNUMBER</th> <td></td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>08F73931F4A691F4</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20200826094036</td> </tr><tr bgcolor="#E3E3F3"> <th>ADDRESSUNITNUMBER</th> <td></td> </tr></table></center> 
max values  : kml_999,                  <center><table><tr><th colspan='2' align='center'><em>Attributes</em></th></tr><tr bgcolor="#E3E3F3"> <th>ADDRESSBLOCKHOUSENUMBER</th> <td></td> </tr><tr bgcolor=""> <th>ADDRESSBUILDINGNAME</th> <td></td> </tr><tr bgcolor="#E3E3F3"> <th>ADDRESSPOSTALCODE</th> <td>829646</td> </tr><tr bgcolor=""> <th>ADDRESSSTREETNAME</th> <td>200, PONGGOL SEVENTEENTH AVENUE, SINGAPORE 829646</td> </tr><tr bgcolor="#E3E3F3"> <th>ADDRESSTYPE</th> <td></td> </tr><tr bgcolor=""> <th>DESCRIPTION</th> <td>Child Care Services</td> </tr><tr bgcolor="#E3E3F3"> <th>HYPERLINK</th> <td></td> </tr><tr bgcolor=""> <th>LANDXADDRESSPOINT</th> <td>0</td> </tr><tr bgcolor="#E3E3F3"> <th>LANDYADDRESSPOINT</th> <td>0</td> </tr><tr bgcolor=""> <th>NAME</th> <td>RAFFLES KIDZ @ PUNGGOL PTE LTD</td> </tr><tr bgcolor="#E3E3F3"> <th>PHOTOURL</th> <td></td> </tr><tr bgcolor=""> <th>ADDRESSFLOORNUMBER</th> <td></td> </tr><tr bgcolor="#E3E3F3"> <th>INC_CRC</th> <td>379D017BF244B0FA</td> </tr><tr bgcolor=""> <th>FMEL_UPD_D</th> <td>20200826094036</td> </tr><tr bgcolor="#E3E3F3"> <th>ADDRESSUNITNUMBER</th> <td></td> </tr></table></center> 
sg
class       : SpatialPolygonsDataFrame 
features    : 60 
extent      : 2663.926, 56047.79, 16357.98, 50244.03  (xmin, xmax, ymin, ymax)
crs         : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
variables   : 4
names       : GDO_GID, MSLINK, MAPID,              COSTAL_NAM 
min values  :       1,      1,     0,             ISLAND LINK 
max values  :      60,     67,     0, SINGAPORE - MAIN ISLAND 
mpsz
class       : SpatialPolygonsDataFrame 
features    : 323 
extent      : 2667.538, 56396.44, 15748.72, 50256.33  (xmin, xmax, ymin, ymax)
crs         : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
variables   : 15
names       : OBJECTID, SUBZONE_NO, SUBZONE_N, SUBZONE_C, CA_IND, PLN_AREA_N, PLN_AREA_C,       REGION_N, REGION_C,          INC_CRC, FMEL_UPD_D,     X_ADDR,     Y_ADDR,    SHAPE_Leng,    SHAPE_Area 
min values  :        1,          1, ADMIRALTY,    AMSZ01,      N, ANG MO KIO,         AM, CENTRAL REGION,       CR, 00F5E30B5C9B7AD8,      16409,  5092.8949,  19579.069, 871.554887798, 39437.9352703 
max values  :      323,         17,    YUNNAN,    YSSZ09,      Y,     YISHUN,         YS,    WEST REGION,       WR, FFCCF172717C2EAF,      16409, 50424.7923, 49552.7904, 68083.9364708,  69748298.792 

The geospatial data have been converted into their respective sp’s Spatial* classes.

5.2 Convert the Spatial* class into generic sp format

spatstat requires analytical data in ppp object form. There is no direct method to convert Spatial* classes into ppp objects, so we first need to convert Spatial* classes into generic sp objects.

The code chunk below performs this conversion.

childcare_sp <- as(childcare, "SpatialPoints")
sg_sp <- as(sg, "SpatialPolygons")

These sp objects properties are displayed as below.

childcare_sp
class       : SpatialPoints 
features    : 1545 
extent      : 11203.01, 45404.24, 25667.6, 49300.88  (xmin, xmax, ymin, ymax)
crs         : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
sg_sp
class       : SpatialPolygons 
features    : 60 
extent      : 2663.926, 56047.79, 16357.98, 50244.03  (xmin, xmax, ymin, ymax)
crs         : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 

Note: The are certain differences between Spatial* classes and generic sp object. Taking SpatialPointsDataFrame (Spatial* classes) and SpatialPoints (generic sp object) as an example:

  • SpatialPoints class: used to represent a simple collection of spatial points in a given coordinate system. This class focuses solely on the geometric aspect of spatial data, i.e., the locations of the points.

  • SpatialPointsDataFrame class: extends SpatialPoints by combining spatial coordinates with a data frame of attribute data. This class allows you to store both spatial and non-spatial (attribute) data together.

5.3 Convert the generic sp format into spatstat’s ppp format

Next ppp() function of spatstat is used to convert the SpatialPoints object into spatstat’s ppp object using 2 steps:

  • Extract the point coordinates from the SpatialPoints object.

  • Define the observation window for the ppp object, usually as a rectangle or polygon encompassing all the points.

# Extract the bounding box and point coordinates from the SpatialPoints object
bbox <- bbox(childcare_sp)
coords <- coordinates(childcare_sp)
# Define the observation window for the ppp object, usually as a rectangle or polygon encompassing all the points.
window <- owin(xrange = bbox[1, ], yrange = bbox[2, ])
# Convert SpatialPoints object to ppp using ppp()
childcare_ppp <- ppp(x = coords[, 1], y = coords[, 2], window = window)
Warning: data contain duplicated points
childcare_ppp
Planar point pattern: 1545 points
window: rectangle = [11203.01, 45404.24] x [25667.6, 49300.88] units

We plot childcare_ppp and examine the different.

plot(childcare_ppp)

We can see the subzone boundary is not shown and the points are displayed in overlapping points.

The summary statistics of the newly created ppp object is shown using the code chunk below.

summary(childcare_ppp)
Planar point pattern:  1545 points
Average intensity 1.91145e-06 points per square unit

*Pattern contains duplicated points*

Coordinates are given to 11 decimal places

Window: rectangle = [11203.01, 45404.24] x [25667.6, 49300.88] units
                    (34200 x 23630 units)
Window area = 808287000 square units

Note the warning message about duplicates. In spatial point pattern analysis, duplicates are a significant issue. The statistical methods used for spatial point pattern analysis often assume that the processes are simple, meaning that points should not overlap.

5.4 Handle duplicated points

We can check if the ppp object contain any duplicated point using below code chunk.

any(duplicated(childcare_ppp))
[1] TRUE

The multiplicity() function can be used to count the number of co-incident points.

multiplicity(childcare_ppp)

The number of locations having more than one point event is counted using the code chunk below.

sum(multiplicity(childcare_ppp) > 1)
[1] 128

The output indicates there are 128 duplicated point events.

To visualize the locations of these duplicate points, we plot the childcare data using the code chunk below.

tmap_mode('view')
tmap mode set to interactive viewing
tm_shape(childcare) +
  tm_dots(alpha=0.4, 
          size=0.05)
tmap_mode("plot")
tmap mode set to plotting

Note: As alpha defines the transparency of the dots, locations with darker dots (less transparent) indicates duplication since it have multiple points overlaying the same spot.

There are three ways to address this issue of duplicated points:

  • The simplest method is to delete the duplicates, but this could result in losing valuable point events.

  • The second option is to use jittering, which adds a small perturbation to the duplicate points so they no longer occupy the exact same location.

  • The third approach is to make each point “unique” and attach duplicates as marks or attributes to these points. This requires using analytical techniques that consider these marks.

The code chunk below implements the jittering approach.

childcare_ppp_jit <- rjitter(childcare_ppp, 
                             retry=TRUE, 
                             nsim=1, 
                             drop=TRUE)

We check again for duplication.

any(duplicated(childcare_ppp_jit))
[1] FALSE

The output is FALSE indicating there are no duplicated point in childcare_ppp_jit

5.5 Create owin object

When analyzing spatial point patterns, it is important to limit the analysis to a specific geographical area, such as the boundary of Singapore. In spatstat, an object called owin is specifically designed to represent such polygonal regions.

The code chunk below converts the sg simple feature object into an owin object for use in spatstat.

sg_owin <- as.owin(sg_sf)

Plot the output object using plot() function.

plot(sg_owin)

View the summary using summary() of Base R.

summary(sg_owin)

5.6 Combine point events object and owin object

In this final step of geospatial data wrangling, we use the below code chunk to extract childcare events that are located within Singapore boundary.

childcareSG_ppp = childcare_ppp[sg_owin]

The output object combines both the point and polygon features into a single ppp object class, as shown below.

summary(childcareSG_ppp)
Planar point pattern:  1545 points
Average intensity 2.129929e-06 points per square unit

*Pattern contains duplicated points*

Coordinates are given to 11 decimal places

Window: polygonal boundary
50 separate polygons (1 hole)
                 vertices         area relative.area
polygon 1 (hole)       30     -7081.18     -9.76e-06
polygon 2              55     82537.90      1.14e-04
polygon 3              90    415092.00      5.72e-04
polygon 4              49     16698.60      2.30e-05
polygon 5              38     24249.20      3.34e-05
polygon 6             976  23344700.00      3.22e-02
polygon 7             721   1927950.00      2.66e-03
polygon 8            1992   9992170.00      1.38e-02
polygon 9             330   1118960.00      1.54e-03
polygon 10            175    925904.00      1.28e-03
polygon 11            115    928394.00      1.28e-03
polygon 12             24      6352.39      8.76e-06
polygon 13            190    202489.00      2.79e-04
polygon 14             37     10170.50      1.40e-05
polygon 15             25     16622.70      2.29e-05
polygon 16             10      2145.07      2.96e-06
polygon 17             66     16184.10      2.23e-05
polygon 18           5195 636837000.00      8.78e-01
polygon 19             76    312332.00      4.31e-04
polygon 20            627  31891300.00      4.40e-02
polygon 21             20     32842.00      4.53e-05
polygon 22             42     55831.70      7.70e-05
polygon 23             67   1313540.00      1.81e-03
polygon 24            734   4690930.00      6.47e-03
polygon 25             16      3194.60      4.40e-06
polygon 26             15      4872.96      6.72e-06
polygon 27             15      4464.20      6.15e-06
polygon 28             14      5466.74      7.54e-06
polygon 29             37      5261.94      7.25e-06
polygon 30            111    662927.00      9.14e-04
polygon 31             69     56313.40      7.76e-05
polygon 32            143    145139.00      2.00e-04
polygon 33            397   2488210.00      3.43e-03
polygon 34             90    115991.00      1.60e-04
polygon 35             98     62682.90      8.64e-05
polygon 36            165    338736.00      4.67e-04
polygon 37            130     94046.50      1.30e-04
polygon 38             93    430642.00      5.94e-04
polygon 39             16      2010.46      2.77e-06
polygon 40            415   3253840.00      4.49e-03
polygon 41             30     10838.20      1.49e-05
polygon 42             53     34400.30      4.74e-05
polygon 43             26      8347.58      1.15e-05
polygon 44             74     58223.40      8.03e-05
polygon 45            327   2169210.00      2.99e-03
polygon 46            177    467446.00      6.44e-04
polygon 47             46    699702.00      9.65e-04
polygon 48              6     16841.00      2.32e-05
polygon 49             13     70087.30      9.66e-05
polygon 50              4      9459.63      1.30e-05
enclosing rectangle: [2663.93, 56047.79] x [16357.98, 50244.03] units
                     (53380 x 33890 units)
Window area = 725376000 square units
Fraction of frame area: 0.401

Plot the output object.

plot(childcareSG_ppp)

5.7 Extract study areas

The below code chunk is used to extract the 4 target planning areas.

pg <- mpsz_sf %>%
  filter(PLN_AREA_N == "PUNGGOL")
tm <- mpsz_sf %>%
  filter(PLN_AREA_N == "TAMPINES")
ck <- mpsz_sf %>%
  filter(PLN_AREA_N == "CHOA CHU KANG")
jw <- mpsz_sf %>%
  filter(PLN_AREA_N == "JURONG WEST")

Plot the target planning areas.

par(mfrow=c(2,2))
plot(pg, main = "Ponggol")
Warning: plotting the first 9 out of 15 attributes; use max.plot = 15 to plot
all

plot(tm, main = "Tampines")
Warning: plotting the first 9 out of 15 attributes; use max.plot = 15 to plot
all

plot(ck, main = "Choa Chu Kang")
Warning: plotting the first 10 out of 15 attributes; use max.plot = 15 to plot
all

plot(jw, main = "Jurong West")
Warning: plotting the first 9 out of 15 attributes; use max.plot = 15 to plot
all

5.7.1 Create owin object

We convert these sf objects into owin objects as required by spatstat.

pg_owin = as.owin(pg)
tm_owin = as.owin(tm)
ck_owin = as.owin(ck)
jw_owin = as.owin(jw)

5.7.2 Combine childcare points and the study area

We extract childcare centres within the selected regions for further analysis.

childcare_pg_ppp = childcare_ppp_jit[pg_owin]
childcare_tm_ppp = childcare_ppp_jit[tm_owin]
childcare_ck_ppp = childcare_ppp_jit[ck_owin]
childcare_jw_ppp = childcare_ppp_jit[jw_owin]

Next, rescale.ppp() function is used to transform the unit of measurement from metre to kilometre.

childcare_pg_ppp.km = rescale.ppp(childcare_pg_ppp, 1000, "km")
childcare_tm_ppp.km = rescale.ppp(childcare_tm_ppp, 1000, "km")
childcare_ck_ppp.km = rescale.ppp(childcare_ck_ppp, 1000, "km")
childcare_jw_ppp.km = rescale.ppp(childcare_jw_ppp, 1000, "km")

The code chunk below plot these four study areas and the locations of the childcare centres.

par(mfrow=c(2,2), mai = c(0.2,0.2,0.2,0.2))
plot(childcare_pg_ppp.km, main="Punggol")
plot(childcare_tm_ppp.km, main="Tampines")
plot(childcare_ck_ppp.km, main="Choa Chu Kang")
plot(childcare_jw_ppp.km, main="Jurong West")

6 Second-order Spatial Point Patterns Analysis

7 Analyze Spatial Point Process Using G-Function

The G function measures the distribution of distances from an arbitrary event to its nearest neighbor. In this section, you will learn how to compute the G-function estimation using Gest() and perform a Monte Carlo simulation test with envelope() from the spatstat package.

7.0.1 Compute G-function estimation

The code chunk below compute G-function using Gest() of spatat package.

G_CK = Gest(childcare_ck_ppp, correction = "border")
plot(G_CK, xlim=c(0,500))

7.0.2 Perform Complete Spatial Randomness Test

To confirm the observed spatial patterns, a hypothesis test will be conducted with the following hypotheses:

  • H0: The distribution of childcare services in Choa Chu Kang is random.

  • H1: The distribution of childcare services in Choa Chu Kang is not random.

The null hypothesis will be rejected if the p-value is smaller than the alpha level of 0.01.

A Monte Carlo test with the G-function will be used for this analysis.

G_CK.csr <- envelope(childcare_ck_ppp, Gest, nsim = 999)
Generating 999 simulations of CSR  ...
1, 2, 3, ......10.........20.........30.........40.........50.........60..
.......70.........80.........90.........100.........110.........120.........130
.........140.........150.........160.........170.........180.........190........
.200.........210.........220.........230.........240.........250.........260......
...270.........280.........290.........300.........310.........320.........330....
.....340.........350.........360.........370.........380.........390.........400..
.......410.........420.........430.........440.........450.........460.........470
.........480.........490.........500.........510.........520.........530........
.540.........550.........560.........570.........580.........590.........600......
...610.........620.........630.........640.........650.........660.........670....
.....680.........690.........700.........710.........720.........730.........740..
.......750.........760.........770.........780.........790.........800.........810
.........820.........830.........840.........850.........860.........870........
.880.........890.........900.........910.........920.........930.........940......
...950.........960.........970.........980.........990........
999.

Done.

We plot the simulated G-function

plot(G_CK.csr)

7.0.3 Compute G-function estimation

G_tm = Gest(childcare_tm_ppp, correction = "best")
plot(G_tm)

7.0.4 Perform Complete Spatial Randomness Test

To confirm the observed spatial patterns, a hypothesis test will be conducted with the following hypotheses:

  • H0: The distribution of childcare services in the selected area is random.

  • H1: The distribution of childcare services in the selected area is not random.

The null hypothesis will be rejected if the p-value is smaller than the alpha level of 0.01.

A Monte Carlo test with the G-function will be used for this analysis.

G_tm.csr <- envelope(childcare_tm_ppp, Gest, correction = "all", nsim = 999)
Generating 999 simulations of CSR  ...
1, 2, 3, ......10.........20.........30.........40.........50.........60..
.......70.........80.........90.........100.........110.........120.........130
.........140.........150.........160.........170.........180.........190........
.200.........210.........220.........230.........240.........250.........260......
...270.........280.........290.........300.........310.........320.........330....
.....340.........350.........360.........370.........380.........390.........400..
.......410.........420.........430.........440.........450.........460.........470
.........480.........490.........500.........510.........520.........530........
.540.........550.........560.........570.........580.........590.........600......
...610.........620.........630.........640.........650.........660.........670....
.....680.........690.........700.........710.........720.........730.........740..
.......750.........760.........770.........780.........790.........800.........810
.........820.........830.........840.........850.........860.........870........
.880.........890.........900.........910.........920.........930.........940......
...950.........960.........970.........980.........990........
999.

Done.
plot(G_tm.csr)

8 Analyze Spatial Point Process Using F-Function

The F function estimates the empty space function F(r) or its hazard rate h(r) from a point pattern within a window of any shape. In this section, you will learn to compute the F-function estimation using Fest() and perform a Monte Carlo simulation test with envelope() from the spatstat package.

8.0.1 Compute F-function estimation

The code chunk below computes F-function using Fest() of spatat package.

F_CK = Fest(childcare_ck_ppp)
plot(F_CK)

8.0.2 Perform Complete Spatial Randomness Test

To confirm the observed spatial patterns, a hypothesis test will be conducted with the following hypotheses:

  • H0: The distribution of childcare services in the selected area is random.

  • H1: The distribution of childcare services in the selected area is not random.

The null hypothesis will be rejected if the p-value is smaller than the alpha level of 0.01.

A Monte Carlo test with the F-function will be used for this analysis.

F_CK.csr <- envelope(childcare_ck_ppp, Fest, nsim = 999)
Generating 999 simulations of CSR  ...
1, 2, 3, ......10.........20.........30.........40.........50.........60..
.......70.........80.........90.........100.........110.........120.........130
.........140.........150.........160.........170.........180.........190........
.200.........210.........220.........230.........240.........250.........260......
...270.........280.........290.........300.........310.........320.........330....
.....340.........350.........360.........370.........380.........390.........400..
.......410.........420.........430.........440.........450.........460.........470
.........480.........490.........500.........510.........520.........530........
.540.........550.........560.........570.........580.........590.........600......
...610.........620.........630.........640.........650.........660.........670....
.....680.........690.........700.........710.........720.........730.........740..
.......750.........760.........770.........780.........790.........800.........810
.........820.........830.........840.........850.........860.........870........
.880.........890.........900.........910.........920.........930.........940......
...950.........960.........970.........980.........990........
999.

Done.
plot(F_CK.csr)

8.0.3 Compute F-function estimation

The code chunk below computes F-function using Fest() of spatat package.

F_tm = Fest(childcare_tm_ppp, correction = "best")
plot(F_tm)

8.0.4 Perform Complete Spatial Randomness Test

To confirm the observed spatial patterns, a hypothesis test will be conducted with the following hypotheses:

  • H0: The distribution of childcare services in the selected area is random.

  • H1: The distribution of childcare services in the selected area is not random.

The null hypothesis will be rejected if the p-value is smaller than the alpha level of 0.01.

A Monte Carlo test with the F-function will be used for this analysis.

F_tm.csr <- envelope(childcare_tm_ppp, Fest, correction = "all", nsim = 999)
Generating 999 simulations of CSR  ...
1, 2, 3, ......10.........20.........30.........40.........50.........60..
.......70.........80.........90.........100.........110.........120.........130
.........140.........150.........160.........170.........180.........190........
.200.........210.........220.........230.........240.........250.........260......
...270.........280.........290.........300.........310.........320.........330....
.....340.........350.........360.........370.........380.........390.........400..
.......410.........420.........430.........440.........450.........460.........470
.........480.........490.........500.........510.........520.........530........
.540.........550.........560.........570.........580.........590.........600......
...610.........620.........630.........640.........650.........660.........670....
.....680.........690.........700.........710.........720.........730.........740..
.......750.........760.........770.........780.........790.........800.........810
.........820.........830.........840.........850.........860.........870........
.880.........890.........900.........910.........920.........930.........940......
...950.........960.........970.........980.........990........
999.

Done.
plot(F_tm.csr)

9 Analyze Spatial Point Process Using K-Function

The K-function measures the number of events within a given distance of any event. In this section, we explore how to compute K-function estimates using Kest() and perform a Monte Carlo simulation test with envelope() from the spatstat package.

9.0.1 Compute K-function estimation

The code chunk below computes F-function using Fest() of spatat package.

K_ck = Kest(childcare_ck_ppp, correction = "Ripley")
plot(K_ck, . -r ~ r, ylab= "K(d)-r", xlab = "d(m)")

9.0.2 Perform Complete Spatial Randomness Test

To confirm the observed spatial patterns, a hypothesis test will be conducted with the following hypotheses:

  • H0: The distribution of childcare services in the selected area is random.

  • H1: The distribution of childcare services in the selected area is not random.

The null hypothesis will be rejected if the p-value is smaller than the alpha level of 0.01.

A Monte Carlo test with the K-function will be used for this analysis.

K_ck.csr <- envelope(childcare_ck_ppp, Kest, nsim = 99, rank = 1, glocal=TRUE)
Generating 99 simulations of CSR  ...
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 
99.

Done.
plot(K_ck.csr, . - r ~ r, xlab="d", ylab="K(d)-r")

9.0.3 Compute K-function estimation

The code chunk below computes F-function using Fest() of spatat package.

K_tm = Kest(childcare_tm_ppp, correction = "Ripley")
plot(K_tm, . -r ~ r, 
     ylab= "K(d)-r", xlab = "d(m)", 
     xlim=c(0,1000))

9.0.4 Perform Complete Spatial Randomness Test

To confirm the observed spatial patterns, a hypothesis test will be conducted with the following hypotheses:

  • H0: The distribution of childcare services in the selected area is random.

  • H1: The distribution of childcare services in the selected area is not random.

The null hypothesis will be rejected if the p-value is smaller than the alpha level of 0.01.

A Monte Carlo test with the K-function will be used for this analysis.

K_tm.csr <- envelope(childcare_tm_ppp, Kest, nsim = 99, rank = 1, glocal=TRUE)
Generating 99 simulations of CSR  ...
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 
99.

Done.
plot(K_tm.csr, . - r ~ r, 
     xlab="d", ylab="K(d)-r", xlim=c(0,500))

10 Analyze Spatial Point Process Using L-Function

In this section, we explore how to compute L-function estimation using Lest() of spatstat package and perform Monte Carlo simulation test using envelope() of spatstat package.

10.0.1 Compute L-function estimation

The code chunk below computes L-function using Lest() of spatat package.

L_ck = Lest(childcare_ck_ppp, correction = "Ripley")
plot(L_ck, . -r ~ r, 
     ylab= "L(d)-r", xlab = "d(m)")

10.0.2 Perform Complete Spatial Randomness Test

To confirm the observed spatial patterns, a hypothesis test will be conducted with the following hypotheses:

  • H0: The distribution of childcare services in the selected area is random.

  • H1: The distribution of childcare services in the selected area is not random.

The null hypothesis will be rejected if the p-value is smaller than the alpha level of 0.01.

A Monte Carlo test with the L-function will be used for this analysis.

L_ck.csr <- envelope(childcare_ck_ppp, Lest, nsim = 99, rank = 1, glocal=TRUE)
Generating 99 simulations of CSR  ...
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 
99.

Done.
plot(L_ck.csr, . - r ~ r, xlab="d", ylab="L(d)-r")

10.0.3 Compute L-function estimation

The code chunk below computes L-function using Lest() of spatat package.

L_tm = Lest(childcare_tm_ppp, correction = "Ripley")
plot(L_tm, . -r ~ r, 
     ylab= "L(d)-r", xlab = "d(m)", 
     xlim=c(0,1000))

10.0.4 Perform Complete Spatial Randomness Test

To confirm the observed spatial patterns, a hypothesis test will be conducted with the following hypotheses:

  • H0: The distribution of childcare services in the selected area is random.

  • H1: The distribution of childcare services in the selected area is not random.

The null hypothesis will be rejected if the p-value is smaller than the alpha level of 0.01.

A Monte Carlo test with the L-function will be used for this analysis.

L_tm.csr <- envelope(childcare_tm_ppp, Lest, nsim = 99, rank = 1, glocal=TRUE)
Generating 99 simulations of CSR  ...
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 
99.

Done.
plot(L_tm.csr, . - r ~ r, 
     xlab="d", ylab="L(d)-r", xlim=c(0,500))